Abstract: Gelatio is a new software framework for advanced data analysis and digital signal 
processing developed for the Gerda neutrinoless double beta decay experiment. The framework 
is tailored to handle the full analysis flow of signals recorded by high purity Ge detectors and 
photo-multipliers from the veto counters. It is designed to support a multi-channel modular and 
flexible analysis, widely customizable by the user either via human-readable initialization files or 
via a graphical interface. The framework organizes the data into a multi-level structure, from the 
raw data up to the condensed analysis parameters, and includes tools and utilities to handle the 
data stream between the different levels. Gelatio is implemented in C++. It relies upon ROOT 
and its extension Tam, which provides compatibility with PROOF, enabling the software to run in 
parallel on clusters of computers or many-core machines. It was tested on different platforms and 
benchmarked in several GERDA-related applications. A stable version is presently available for the 
Gerda Collaboration and it is used to provide the reference analysis of the experiment data. 

Keywords: Software architectures (event data models, frameworks and databases). Data 
processing methods. Gamma detectors (scintillators, CZT, HPG, HgI etc). 
1. Introduction 

The "GERmanium Detector Array" (Gerda) is an experiment searching for the neutrinoless double beta
decay of $^{76}$Ge, which is presently under commissioning at the underground Laboratori Nazionali
del Gran Sasso of INFN, Italy [?, ?]. Neutrinoless double beta decay is a process which
violates lepton number conservation by two units and is forbidden in the Standard Model. Its
experimental observation would imply the Majorana nature of the neutrino and would also provide a first
measurement of the absolute neutrino mass scale. The experiment will use an array of high-purity
germanium (HPGe) detectors isotopically enriched in $^{76}$Ge (about 40 kg are planned for the final
configuration), aiming to achieve a substantial reduction in the background at the $Q_{\beta\beta}$-value of the
$^{76}$Ge decay with respect to the predecessor experiments [?, ?].

The background reduction in Gerda is obtained by an innovative design approach in which
naked HPGe detectors are operated directly in ultra radio-pure liquid argon, which acts as coolant
material and as passive shielding against the external radiation. The cryogenic liquid is surrounded
by an additional thick layer of ultra-pure water, which is effective in shielding external neutrons and
γ-rays. The water volume is instrumented with photo-multipliers and is operated as a Cherenkov
detector to reject events due to high-energy muons. Part of the remaining background events can
be identified by analyzing the HPGe signal shapes and applying pulse shape discrimination (PSD)
techniques [?]. Moreover, the Gerda collaboration is testing, in the R&D set-up called
LArGe [?], the instrumentation of the liquid argon with photo-multipliers (PMTs), which would provide
an active veto system.
In the present commissioning phase, an array of three non-enriched HPGe detectors has been 
deployed in the Gerda set-up and is in data taking. The charge signals from the HPGe detectors 
are sampled by 14-bit Flash-ADCs (FADC) running at 100 MHz sampling rate and stored to disk 
for off-line analysis. For each physical trigger all the HPGe detector signals are acquired to check 
for coincidences. A second data stream of the experiment is provided by the PMT signals from the 
Cherenkov muon veto, which are digitized by the same FADCs used for the HPGe detectors. 

In this paper a data analysis framework called Gelatio (GErda LAyouT for Input/Output) is 
presented and discussed. The framework was developed to handle the full data analysis flow of 
Gerda as well as of all the R&D activities related to the experiment. The framework has been 
designed to be solid, user-friendly, flexible, maintainable over a long lifetime and scalable to the 
future phases of the experiment. Furthermore, thanks to its generic interfaces, it could be used in 
other activities involving off-line analysis of digitized pulses from HPGe detectors or other kinds 
of detectors. 

The paper is organized as follows. In section 2 the main requirements driving the software
design and the basic concepts of the framework are presented in detail. Section 3 describes the
software implementation and the technical solutions pursued. A few examples of the validation
and the application of the framework are reported in section 4. A summary and a discussion of future
plans are presented in the final section.

2. Concept and design 

Gelatio is a data analysis framework designed to provide a flexible environment and a complete 
suite of tools for off-line digital signal processing and for analysis of data recorded with HPGe 
detectors. The software aims to provide a common platform for the Gerda Collaboration to run 
the analysis of the experiment data. To meet these requirements the framework must be able to: 

• decouple the algorithm implementation from the raw data format allowing the users to run 
the same analysis algorithms on data sets acquired with different hardware and / or encoded 
in different formats; 

• perform a modular and highly customizable digital signal processing. This approach aims 
to simplify and foster as much as possible the re-use, the sharing and the comparison of 
the analysis algorithms, avoiding unnecessary duplications in the code and improving its 
validation; 

• optimize the computational performance and be cross-platform compatible, hiding the
technical aspects from the end users.

The solution worked out is based on two paradigms which are discussed below: multiple level data 
organization and modular digital signal processing. 

2.1 Multi-level data structure 

The raw data, the information extracted by the signal processing and the analysis results are stored
in a hierarchical structure. This approach aims to increase flexibility and enables a multi-user
customized data analysis. Alternative analyses can be created as forks of the default one, sharing
part of the data flow up to a given level and then creating a parallel stream of information. The
multi-level structure naturally includes in the framework the conversion of the raw data into a
new standardized format which is optimized for signal processing and data storage. After the
conversion, all data can be processed along the same analysis stream independently of the parent
data acquisition (DAQ) system data format, including data produced by Monte Carlo simulations.

Figure 1. The hierarchical organization of the data in Gelatio. The framework organizes the output of
each step of the analysis in a different level (Tier), starting from the raw data (Tier0) up to the condensed
parameters of the final analysis. The Tier1 contains the same information as the raw data but encoded in a
different format based on ROOT [?] and MGDO. More details can be found in sect. 3.2.

The multi-tier structure is depicted in Figure 1. The raw data provided by the different DAQ
systems and by the Monte Carlo simulations are stored in the lowest level ("Tier0"). Data are then
converted into a new encoding and stored as "Tier1". The first two tiers contain exactly the same
amount of information, the only difference being that while the Tier0 is the native DAQ format,
the Tier1 is a standardized format that can be chosen to be solid, flexible, exportable and easily
readable. The Tier1 data are distributed to the Gerda collaborators as the starting point for the
analysis. Higher-level tiers, which are produced from the Tier1, are meant to contain the analysis
results. The "Tier2" files store the output information obtained by applying the digital analysis
to the individual traces of each event, such as rise time, amplitude, average noise, baseline average
value, etc. Similarly, the "Tier3" files store information extracted from the Tier2, e.g. the actual
energy spectrum obtained by calibrating the amplitude spectrum with the appropriate calibration
curves. As the analysis becomes more and more refined (noise rejection, pulse shape discrimination
analysis, delayed coincidence, veto, etc.), the information can be stored in higher-level tiers.

A drawback of this approach is the additional disk space required by the coexistence
of the same information in both the Tier0 and the Tier1. On the other hand, the raw data are not
meant to be distributed because the Collaboration plans to blind the events with energy close to the
region of interest ($Q_{\beta\beta}$-value of $^{76}$Ge). Raw data will be backed up in the computing centers of
three different institutions and used only to generate the Tier1. In the conversion process the data
blinding is applied and only the resulting Tier1 data are released to the Collaboration.
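
The blinding step described above can be illustrated with a minimal sketch. It is not the Gerda blinding procedure: the window half-width, the event dictionaries and the assumption that an energy estimate is already available at conversion time are all hypothetical. Only the approximate $Q_{\beta\beta}$-value of $^{76}$Ge (about 2039 keV) is a physical input.

```python
# Approximate Q-value of the 76Ge double beta decay, in keV.
Q_BB_KEV = 2039.0


def blind_events(events, half_width_kev=20.0):
    """Return only the events whose energy lies outside the blinded window.

    The half-width is a hypothetical value chosen for illustration.
    """
    low = Q_BB_KEV - half_width_kev
    high = Q_BB_KEV + half_width_kev
    return [ev for ev in events if not (low <= ev["energy_kev"] <= high)]


# Hypothetical Tier0 content: one event falls inside the blinded window.
tier0_events = [
    {"id": 0, "energy_kev": 583.2},
    {"id": 1, "energy_kev": 2035.0},
    {"id": 2, "energy_kev": 2614.5},
]

# Only the surviving events would be written to the distributed Tier1.
tier1_events = blind_events(tier0_events)
print([ev["id"] for ev in tier1_events])
```

Applying the cut inside the Tier0 to Tier1 conversion means that no downstream tier can contain the blinded events, whatever analysis chain is run afterwards.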
2.2 Modular digital signal processing 

The core of Gelatio is the digital signal processing which creates the Tier2 files starting from
the detector signals stored in the Tier1. In this step different algorithms are applied to the signals
in order to extract efficiently the pulse shape information, for instance maximum amplitude, rise
time, baseline slope, etc. In γ-ray spectroscopy these operations are usually performed by chains
of elementary digital filters (differentiation, integration, deconvolution, etc.) optimized to reduce
the noise and to extract the information with high precision.
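
As an illustration of such a filter chain, the sketch below combines a deconvolution of the preamplifier decay, a moving-window difference and a moving average, in the spirit of a moving-window deconvolution. It is a generic textbook-style version written for this example, not the Gelatio or MGDO implementation; the function names and the pure-Python list representation are assumptions.

```python
import math


def deconvolve(trace, tau_samples):
    """Undo an exponential decay with time constant tau (in samples)."""
    d = math.exp(-1.0 / tau_samples)
    out, running_sum = [], 0.0
    for x in trace:
        # Add back the charge that has leaked away since the pulse start.
        out.append(x + (1.0 - d) * running_sum)
        running_sum += x
    return out


def moving_window_difference(trace, width):
    """Differentiate over a window: y[n] = x[n] - x[n - width]."""
    return [x - (trace[i - width] if i >= width else 0.0)
            for i, x in enumerate(trace)]


def moving_average(trace, width):
    """Average over the last `width` samples (earlier samples count as zero)."""
    out, acc = [], 0.0
    for i, x in enumerate(trace):
        acc += x
        if i >= width:
            acc -= trace[i - width]
        out.append(acc / width)
    return out


def shape(trace, tau_samples, window, average):
    """Chain the three elementary filters into a trapezoidal shaper."""
    step = deconvolve(trace, tau_samples)
    trapezoid = moving_window_difference(step, window)
    return moving_average(trapezoid, average)


# Synthetic pulse: flat baseline, then an exponentially decaying step.
pulse = [0.0] * 20 + [1000.0 * math.exp(-k / 50.0) for k in range(200)]
shaped = shape(pulse, tau_samples=50.0, window=40, average=10)
amplitude = max(shaped)
```

For this noiseless pulse the flat top of the shaped trace reproduces the injected amplitude; on real traces the window and averaging widths trade noise suppression against ballistic deficit and pile-up.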

To support a highly customizable analysis, the design of Gelatio is based on a modular
approach. The analysis is divided into modules, each handling a unique and consistent task of
the digital data processing, for instance energy reconstruction or baseline subtraction. Each
module includes a chain of elementary digital filters which is optimized to extract the information
of interest from the signal trace. The computed information as well as the shaped traces can be
used as input for other modules. The list of active modules and the parameters used by the internal
chain of filters are controlled by the end user through an appropriate ASCII initialization file (INI
file).
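
The idea that each module reads a named trace and publishes a new one for later modules can be sketched as follows. The class and function names are hypothetical and the filters are deliberately trivial; the actual Gelatio modules are C++ classes built on Tam.

```python
class Module:
    """A named processing step that reads one trace and writes another."""

    def __init__(self, name, input_trace, output_trace, transform):
        self.name = name
        self.input_trace = input_trace
        self.output_trace = output_trace
        self.transform = transform

    def process(self, traces):
        # Publish the shaped trace so that later modules can use it as input.
        traces[self.output_trace] = self.transform(traces[self.input_trace])


def subtract_baseline(trace, n_baseline=4):
    """Subtract the average of the first n_baseline samples."""
    baseline = sum(trace[:n_baseline]) / n_baseline
    return [x - baseline for x in trace]


def differentiate(trace):
    """First-order numerical derivative (first sample set to zero)."""
    return [0.0] + [b - a for a, b in zip(trace, trace[1:])]


def run_chain(modules, original_trace):
    """Run the modules in order on a single event trace."""
    traces = {"originalTrace": list(original_trace)}
    for module in modules:
        module.process(traces)
    return traces


chain = [
    Module("BaselineModule_1", "originalTrace", "restoredTrace",
           subtract_baseline),
    Module("DifferentiatorModule_1", "restoredTrace", "differentiatedTrace",
           differentiate),
]
traces = run_chain(chain, [100, 100, 100, 100, 100, 150, 200, 200])
```

Because modules communicate only through trace names, the same module class can appear several times in a chain with different inputs and parameters.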

This design provides wide flexibility, as complicated chains of modules can be created by the
user in an open and transparent way through the INI file. The same module can be run many times
within the same execution and used in different chains, each time with a different set of parameters.
Moreover, the user can easily create new modules implementing their own customized analysis tasks.
The new modules are immediately available for registration in the INI file and can be combined
with the standard ones to create new chains. The data flow and the INI file of an illustrative analysis
are reported in Figure 2 and Figure 3, respectively. This solution also enhances the re-use of the
analysis algorithms and avoids code proliferation.
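
A short sketch of how an INI file of this kind could be read is given below, using Python's standard `configparser`. The parsing logic, the helper names and the `false` entry in the task list are assumptions made for the example; the section and key names follow the INI file of the illustrative analysis, and the actual Gelatio parser may behave differently.

```python
import configparser
import re

INI_TEXT = """
[Parameters]
FileList=tier1_r01.root tier1_r02.root
OutputFile=output.root

[TaskList]
Task_BaselineModule_1=true
Task_TriggerModule_1=true
Task_RiseTimeModule_1=false

[Task_BaselineModule_1]
InputTraceName=originalTrace
OutputTraceName=restoredTrace
BaselineRestorationStart=100ns
BaselineRestorationStop=2us

[Task_TriggerModule_1]
InputTraceName=restoredTrace
NumberOfSigmaThs=3
TimeAboveThs=100ns
"""

UNIT_TO_NS = {"ns": 1.0, "us": 1.0e3, "ms": 1.0e6}


def parse_time_ns(text):
    """Convert a string such as '2us' into a number of nanoseconds."""
    match = re.fullmatch(r"\s*([0-9]+(?:\.[0-9]+)?)\s*(ns|us|ms)\s*", text)
    if match is None:
        raise ValueError("not a time with unit: %r" % text)
    return float(match.group(1)) * UNIT_TO_NS[match.group(2)]


def load_active_tasks(ini_text):
    """Return {task name: parameters} for the tasks switched on in TaskList."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the case of the keys as written
    parser.read_string(ini_text)
    tasks = {}
    for name in parser["TaskList"]:
        if parser["TaskList"].getboolean(name):
            tasks[name] = dict(parser[name])
    return tasks


tasks = load_active_tasks(INI_TEXT)
stop_ns = parse_time_ns(tasks["Task_BaselineModule_1"]["BaselineRestorationStop"])
```

Keeping the task list separate from the per-task blocks lets a task be switched off without deleting its parameters.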

3. Implementation 

The core of the framework is implemented in C++ to ensure an easy and natural interfacing with
several general-purpose scientific projects. This choice also provides high computational performance,
wide flexibility thanks to the object-oriented programming support, and cross-platform
compatibility. Gelatio depends on the CLHEP [?] and FFTW3 [?] libraries for scientific computing,
and on the ROOT [?] and Tam [?] libraries for the management of the modular analysis,
the data storage and the graphical tools. All the external software packages mentioned above are
freeware and open-source. Gelatio additionally depends on the MGDO package for the basic
digital signal processing algorithms and for the definition of the objects used to encapsulate the
information in the Tier1 output. MGDO (Majorana-GERDA Data Objects) is a set of libraries that
are jointly maintained and developed by the Majorana [?] and Gerda collaborations. They are
specifically designed to improve the encapsulation and the handling of complex data as dedicated
C++ objects.

Gelatio is distributed to the Gerda collaborators as source code. It can
be compiled on any platform supporting GNU C++, including Linux and MacOS. A configure
script automatically sets the appropriate paths and environment variables necessary
to compile the code. The installation procedure has been successfully tested on both 32- and 64-bit
operating systems.
Figure 2. Data flow of an illustrative analysis which uses three chains of modules. The first chain (blue
arrows) reconstructs the event amplitude and includes the baseline restoration and correction for pile-up
(BaselineModule) and the computation of the trigger time (TriggerModule). The signal shaped by
BaselineModule and the trigger computed by TriggerModule are used as input for EnergyGastModule, which
reconstructs the pulse amplitude according to the Gast moving-window-deconvolution approach [?]. The
second chain (red arrows) is used to estimate the rise time of the signal. The traces are first shaped by
BaselineModule, interpolated by InterpolatingModule to push the time resolution below the sampling
period and finally processed by RiseTimeModule. The last chain (green arrows) computes the rise time of
the derivative of the signals (current signal) and shares the first two modules with the previous chain. The
signal shaped by InterpolatingModule is fed to DifferentiatorModule to compute the numerical derivative,
and is finally passed to a second instance of RiseTimeModule. The figure shows for each module the
input and output trace (second and third line) and the main parameters used by the internal algorithms (last
line).
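
The rise-time estimate performed by the second chain can be sketched as below. This is a simplified illustration, not the RiseTimeModule code: it assumes a baseline-subtracted trace, takes the maximum as the pulse amplitude and interpolates linearly between samples. The 10 ns sampling period corresponds to the 100 MHz FADCs; the function names are hypothetical.

```python
def crossing_time(trace, level):
    """Return the (fractional) sample index at which the trace first crosses level."""
    for i in range(1, len(trace)):
        if trace[i - 1] < level <= trace[i]:
            # Linear interpolation between the two neighbouring samples.
            fraction = (level - trace[i - 1]) / (trace[i] - trace[i - 1])
            return (i - 1) + fraction
    raise ValueError("level %r is never crossed" % level)


def rise_time(trace, low_edge=10, high_edge=90, sample_period_ns=10.0):
    """Time between the low_edge% and high_edge% crossings of the amplitude."""
    amplitude = max(trace)
    t_low = crossing_time(trace, amplitude * low_edge / 100)
    t_high = crossing_time(trace, amplitude * high_edge / 100)
    return (t_high - t_low) * sample_period_ns


# Synthetic baseline-subtracted pulse: linear ramp from 0 to 100 in 10 samples.
ramp = [0.0] * 5 + [10.0 * k for k in range(1, 11)] + [100.0] * 5
rt_ns = rise_time(ramp)
```

The `low_edge` and `high_edge` arguments mirror the `LowEdge` and `HighEdge` parameters of the INI file of Figure 3; interpolating the trace beforehand, as the chain does, reduces the error of the linear approximation between samples.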

To ensure flexibility and good computational performance, the framework includes both compiled
and interpreted code. In section 3.1 and section 3.2 the implementation of the two executables
in charge of the raw data to Tier1 conversion and of the actual modular data analysis (Tier2
production) is described in detail. Then the suite of Bash and Python scripts to handle the data
streaming through the different tiers is presented in section 3.3. Section 3.4 finally describes
the graphical interface used to display the event traces, define the shaping parameters and create
the INI files.

3.1 Conversion of raw data to the analysis format 

The binary data format chosen for Tier1 is a ROOT file containing a TTree of MGDO objects
(MGTEvent and MGTRun). The MGDO objects employed in the Tier1 output are containers
which encapsulate the basic information of individual events (signal traces, time stamps, DAQ
flags, etc.) and of runs (start and stop times, run type). The use of ROOT files has many
advantages, most notably the streamers of the ROOT objects, the compression routines and the
interface to the ROOT graphic utilities.

[Parameters]
FileList=tier1_r01.root tier1_r02.root
OutputFile=output.root
LogFile=output.log

[TaskList]
Task_BaselineModule_1=true
Task_TriggerModule_1=true
Task_EnergyGastModule_1=true
Task_InterpolatingModule_1=true
Task_RiseTimeModule_1=true
Task_DifferentiatorModule_1=true
Task_RiseTimeModule_2=true

[Task_BaselineModule_1]
InputTraceName=originalTrace
OutputTraceName=restoredTrace
BaselineRestorationStart=100ns
BaselineRestorationStop=2us
TauPreamp=47us
PileUpCorrection=true

[Task_TriggerModule_1]
InputTraceName=restoredTrace
NumberOfSigmaThs=3
TimeAboveThs=100ns
IntegrationWindowWidth=50ns

[Task_EnergyGastModule_1]
InputTraceName=restoredTrace
DifferentiationWindowWidth=10us
IntegrationWindowWidth=8us
FlatTopPosition=0.4

[Task_InterpolatingModule_1]
InputTraceName=restoredTrace
OutputTraceName=interpolatedTrace
SubSampleNumber=10

[Task_RiseTimeModule_1]
InputTraceName=interpolatedTrace
LowEdge=10
HighEdge=90

[Task_DifferentiatorModule_1]
InputTraceName=interpolatedTrace
OutputTraceName=differentiatedTrace
DifferentiationWindowWidth=50ns

[Task_RiseTimeModule_2]
InputTraceName=differentiatedTrace
LowEdge=10
HighEdge=90

Figure 3. Example of an INI file implementing the analysis described in Figure 2. The INI file is organized
in blocks. The first two blocks (Parameters and TaskList) are used to define the input and output files and
to register the list of modules, respectively. The following blocks are used to define the parameters of the
registered modules, for instance the input and output traces.

The conversion of raw data into the Tier1 format is performed by the executable Raw2MGDO,
which accepts a list of raw data files and lets the user customize the name and the number of the output
files. The framework contains dedicated classes ("Decoders") which are used by Raw2MGDO
to decode the supported binary raw files, in order to read the information to be copied to the Tier1
structure. At the moment, six different decoders are available in Gelatio, supporting all data
formats currently employed in the Gerda activities. The decoders take care of extracting the information
from the raw file and of all the required preprocessing, such as endianness inversion, before
writing it to the ROOT file. The Gelatio decoders inherit from the same virtual base class, in
order to improve flexibility and to avoid code duplication. The common interface defined by the
virtual base class eases the extension or upgrade of the present decoders as well as the implementation
of new decoders to read any other binary data format.
Such an approach makes the framework able to handle in a completely transparent way a data
stream containing an unspecified number of DAQ channels, each with digitized traces of unspecified
length. This is required because the number of operational detectors and the digitization parameters
will change during the experiment lifetime. Similarly, Gelatio is able to handle a mixed
stream coming from different types of detectors (e.g. HPGe detectors and PMTs in LArGe).
The Tier1 data format is also used as output of the pulse simulation software developed in the
framework of Gerda. Consequently, the simulated traces can be treated in the same way as
the experimental data and be processed along the same analysis flow, enabling an easy and direct
Monte Carlo-to-data comparison.
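
The decoder design, with one common interface and format-specific subclasses that also take care of the byte order, can be sketched as follows. The real decoders are C++ classes; the class names, the 16-bit sample layout and the `convert` helper below are assumptions made for illustration only.

```python
import struct
from abc import ABC, abstractmethod


class Decoder(ABC):
    """Common interface: turn the raw bytes of one event into a list of samples."""

    @abstractmethod
    def decode_event(self, payload):
        ...


class FixedWidthDecoder(Decoder):
    """Decode unsigned 16-bit samples with a byte order set by the subclass."""

    byte_order = None  # ">" for big-endian, "<" for little-endian

    def decode_event(self, payload):
        if len(payload) % 2 != 0:
            raise ValueError("payload does not contain whole 16-bit samples")
        n_samples = len(payload) // 2
        return list(struct.unpack("%s%dH" % (self.byte_order, n_samples), payload))


class BigEndianDecoder(FixedWidthDecoder):
    byte_order = ">"


class LittleEndianDecoder(FixedWidthDecoder):
    byte_order = "<"


def convert(decoder, payloads):
    """Format-independent conversion loop: it only relies on the Decoder interface."""
    return [decoder.decode_event(payload) for payload in payloads]


# The same samples encoded by two hypothetical DAQ systems.
samples = [1, 256, 16383]
big_endian_raw = struct.pack(">3H", *samples)
little_endian_raw = struct.pack("<3H", *samples)

from_big = convert(BigEndianDecoder(), [big_endian_raw])
from_little = convert(LittleEndianDecoder(), [little_endian_raw])
```

Both calls yield identical sample lists, which is the property that lets everything downstream of the conversion ignore the original raw format.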

The testing and benchmarking of Gelatio was performed by using the main Gerda server.
The server runs Scientific Linux 5.5 64-bit and is equipped with a Dual Xeon E5620 CPU (2x4 cores at
2.4 GHz with 2 x 12 MB cache), 16 GB of RAM, and 20 hard-disks (2 TB) connected through a
SATA 3 Gb/s interface and operated in RAID6/XFS. The computational performance of the conversion
program is affected by the encoding and type of raw data and by the ROOT compression
options required. For instance, a typical Gerda calibration run (about $3.5 \cdot 10^{6}$ waveforms, each
having 4096 samples, total size about 30 GB) is completely converted into the Tier1 format by the
reference machine in about 100 minutes using a single thread. The processing time is substantially
reduced if the ROOT compression option is switched off, at the expense of additional disk space. It
has to be emphasized that the conversion of raw data into the Tier1 format must be done only once,
so the best compromise is usually to pay in CPU computing time to obtain a smaller output.
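
The figures quoted for the calibration run can be cross-checked with a few lines of arithmetic. The sketch assumes that each 14-bit sample is stored in a 16-bit word and that the number of waveforms is of order $10^{6}$; both are assumptions, adopted because they reproduce the quoted total size of about 30 GB.

```python
n_waveforms = 3.5e6            # assumed order of magnitude of the run
samples_per_waveform = 4096
bytes_per_sample = 2           # assumption: 14-bit sample stored in 16 bits

# Raw payload of the run, ignoring headers and time stamps.
raw_size_gb = n_waveforms * samples_per_waveform * bytes_per_sample / 1e9

# Single-thread conversion time quoted in the text.
conversion_time_s = 100 * 60

waveforms_per_second = n_waveforms / conversion_time_s
throughput_mb_per_s = raw_size_gb * 1e3 / conversion_time_s

print("raw size   : %.1f GB" % raw_size_gb)
print("conversion : %.0f waveforms/s, %.1f MB/s"
      % (waveforms_per_second, throughput_mb_per_s))
```

Under these assumptions the payload is about 29 GB and the conversion proceeds at several hundred waveforms per second, i.e. a few MB/s, far below what the disk array can deliver; this is consistent with the statement that the conversion time is dominated by CPU work such as compression.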

3.2 Implementation of digital signal processing 

To implement the modular analysis following the design discussed in section 2.2, the framework
relies on the Tree-Analysis Module (Tam) package. Tam [?] is a free package for ROOT developed
to provide a very general and modular interface for analyzing data stored in a TTree. The
software combines the features of two ROOT objects: the TSelector method for processing trees
and the TTask for handling a hierarchical structure of modules in a user-transparent way.

In Gelatio each analysis module is a concrete class derived from the basic interface TAModule
provided by Tam via an additional Gelatio-specific base class named GERDAModule. Tam is
used to handle the event loading from the Tier1 file, the exchange of information among different
modules and the object output list. Moreover, the interfacing with Tam ensures the compatibility
with the ROOT extension PROOF (Parallel ROOT Facility) [?], enabling the software to run several
threads in parallel.

The executable in charge of the Tier2 creation takes care of instantiating the Tam interface,
initializing the analysis modules according to the instructions provided through the INI file, and
of storing all the outputs of the same execution in a single ROOT file. The output is a collection of
ROOT objects usually containing a TTree for each module but also histograms or signal traces.
The software also provides a master TTree which can be used for unrestricted and parallel access
to information contained in any other TTree in the file, via the ROOT friendship mechanism.

The CPU time required to run an analysis depends on the number of active channels and modules
as well as on the module parameters. For instance, the standard Gerda analysis chain includes
baseline restoration, trigger position and rise time computation, and two independent modules for
amplitude reconstruction. A typical calibration run containing $3.5 \cdot 10^{6}$ traces, each 4096 samples
long, is processed according to the Gerda standard analysis chain in less than 4 hours by using a
single thread of the reference benchmark machine.

Figure 4. Data flow of the information through the different tiers performed by using the suite of utilities
(blue nodes) included in Gelatio. These utilities are designed to provide an interface to import raw data in
the framework, convert them into the Tier1 format, create the INI file and run the digital signal processing.
Moreover, the utilities allow the user to perform an interactive and graphical calibration of the amplitude
spectra.

3.3 Utilities 

To help the handling of the data stream through the different tiers, Gelatio includes a suite of 
utilities implemented as Bash and Python scripts. The utilities work over a well-defined directory 
structure ("analysis file system") and provide a user-interface for each step of the analysis. The 
scripts take care of identifying which files should be processed and of storing the results of each 
step in the proper directory, including a log file collecting the standard outputs generated by the 
executables. Since the information is stored in fixed directories inside the file system, each user is 
immediately able to recover any output. 
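
The bookkeeping performed by such utilities, i.e. working out which files still need to be processed from a fixed directory layout, can be sketched as follows. The directory names `tier1` and `tier2`, the file naming and the helper functions are hypothetical; the layout of the actual analysis file system is not specified here.

```python
import tempfile
from pathlib import Path


def pending_tier1_files(analysis_root):
    """List the Tier1 files that have no Tier2 output with the same name yet."""
    root = Path(analysis_root)
    done = {path.name for path in (root / "tier2").glob("*.root")}
    return sorted(path for path in (root / "tier1").glob("*.root")
                  if path.name not in done)


def demo():
    """Build a throw-away analysis file system and query it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "tier1").mkdir()
        (root / "tier2").mkdir()
        for name in ("run01.root", "run02.root"):
            (root / "tier1" / name).touch()
        # run01 has already been processed, run02 has not.
        (root / "tier2" / "run01.root").touch()
        return [path.name for path in pending_tier1_files(root)]


print(demo())
```

With a fixed layout the state of the analysis is encoded in the file system itself, so a script can be re-run safely and only picks up the files that are still missing.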

In Figure 4, each step of the data flow is depicted together with the associated utility. The
utilities up to the Tier2 builder are implemented as simple Bash scripts and are designed to provide
an interface to the file system and to run the Gelatio executables with the proper options. The
last two routines are more complicated, as they are supposed to provide the user with an interactive
graphical tool to calibrate the energy spectra and to create the Tier3. Moreover, they take care of
storing the calibration parameters of each channel in different directories of the output TFile,
together with all the information important for debugging, i.e. the calibration log files and the
plots of the fits. These utilities are implemented in Python to take advantage of the ROOT binding
PyROOT, which enables cross-calls from Python to ROOT/CINT [?].

3.4 The graphical interface 

The graphical user interface (GUI) integrated in Gelatio (Figure 5) is a general and powerful tool
developed to create and handle INI files. The interface is implemented entirely by using ROOT
graphical components. Despite the lack of flexibility and the intrinsic limitations of the ROOT
graphical libraries, the native ROOT solution has been preferred to ensure smooth integration with
the rest of the framework and to minimize external dependencies.

Figure 5. Screen shots of the Gelatio GUI. The input Tier1 file comes from a Gerda background run and
contains three traces per event. The screen shots show the tools and utilities available in the GUI: (a) Event
displayer. The signals from the three channels are displayed together. (b) INI file editor. It can be used to
select and customize the analysis tasks to be performed. The "Module list" contains all the analysis modules
available in Gelatio. (c) INI output summary. It shows the human-readable INI file produced according to
the user choices in the INI file editor. (d) Event analyzer. It applies the full analysis chain to a given trace.
The screen shot shows the intermediate shaped traces calculated by the analysis module which implements
the Gast [?] algorithm for amplitude reconstruction.

The GUI aims to help the user in the creation and in the testing of the INI files and is able to 
handle multiple files at the same time. The layout is based on five main tabs: 

• Files: to select the Tier1 files to analyze. The files can be selected by using a graphical
window, and multiple files can be selected at the same time.

• Event viewer (Figure 5.a): to browse channel-by-channel the events contained in the selected
Tier1 files. Several channels can be displayed at the same time.

• Modules (Figure 5.b): to configure the tasks to be used for the analysis. A "task" is a module
with a particular set of parameters. The modules to be activated can be selected from a list.
For each module the GUI displays the full list of the customizable parameters, which can
be edited interactively by the user. Since modules can be configured with many
independent parameters, it is particularly useful to have a visual list of them and of their
default values. For each parameter the GUI shows a color-coded label. The green code means
that the selected value is equal to the default; the yellow code stands for a valid value which
is different from the default; the red code indicates an invalid parameter value (wrong unit,
for instance).

• Summary (Figure 5.c): to view the resulting INI file.

• Event Analyzer (Figure 5.d): to test the INI file, namely the analysis chain, on a single
event in the Tier1 file. It is possible to display the input/output traces of each module and
also the intermediate traces produced along the digital processing. Such a tool proved to
be very useful for debugging the analysis chain and for tuning and optimizing the values of the
parameters to be used for a given data set.
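
The color coding of the Modules tab amounts to a three-way classification of each parameter value. A minimal sketch, restricted to time-like parameters and using a hypothetical validity rule (a number followed by one of the units ns, us or ms), could look like this; the GUI itself is written with the ROOT graphical classes.

```python
import re

# Hypothetical validity rule for time-like parameters: number + unit.
TIME_PATTERN = re.compile(r"[0-9]+(\.[0-9]+)?(ns|us|ms)")


def parameter_status(value, default):
    """Classify a parameter value following the green/yellow/red convention."""
    if TIME_PATTERN.fullmatch(value) is None:
        return "red"      # invalid value, e.g. wrong or missing unit
    if value == default:
        return "green"    # valid and equal to the default
    return "yellow"       # valid but customized by the user


statuses = {
    "default kept": parameter_status("100ns", "100ns"),
    "customized": parameter_status("2us", "100ns"),
    "wrong unit": parameter_status("100kg", "100ns"),
}
print(statuses)
```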

4. Application and Benchmarking 

The framework has been used up to now to handle and analyze data from several Gerda-related
activities, including calibrations with radioactive sources. In particular, Gelatio was used for
the data analysis of the Gerda R&D activities related to the Broad Energy Germanium (BEGe)
detectors. The experimental data [?] and the corresponding Monte Carlo simulations [?]
were processed on the same footing to compare the results directly. The pulse shape discrimination
algorithms developed for the BEGe detectors [?] were coded as dedicated Gelatio modules
and successfully applied to data. Results in terms of discrimination efficiency calculated with the
Gelatio modules are consistent with those obtained with the dedicated code of Ref. [?]. The
present analysis of the LArGe data is based on Gelatio; in this case, the framework is able
to handle the data streams coming from the HPGe detector and from the PMTs of the instrumented
liquid argon veto. Finally, the framework is currently used for the reference analysis of the
data collected in the Gerda commissioning with three HPGe detectors.

Up to now Gelatio has been used on data files coming from six independent DAQ systems,
differing in binary data format, number of channels, sampling frequency and sampling window. It
proved to be able to handle correctly multiple DAQ channels, possibly referring to different kinds
of detectors, and the pile-up corrections that must be applied for source calibration runs because of
the higher counting rate. The software turned out to be stable and robust. Fixes for the few minor
bugs reported since the release of the stable version are made available in regularly updated
versions of Gelatio.

Figure 6 shows the distribution of the trace trigger time (upper panel) and the amplitude spectrum
(lower panel) of one of the HPGe detectors deployed in Gerda irradiated with a $^{228}$Th calibration
source. The distributions are obtained at the end of the Gelatio-based analysis flow,
using the reference INI file described in section 3.2. The amplitude spectrum of Figure 6 has been
produced after having discarded those events having a reconstructed trigger time significantly far
from the expected value. Notice that trigger times and trace amplitudes are calculated by two
independent analysis modules and stored in two separate TTrees in the Tier2 file. The amplitude
and trigger information from the different TTrees is correlated via the master TTree created by
Gelatio.

Figure 6. Analysis results obtained by the Gelatio processing of a $^{228}$Th calibration taken in Gerda.
Upper panel: distribution of the reconstructed trigger time for one of the three detectors. The DAQ chain is
set in order to have the trace at about 120 μs after the start of the sampling window. Lower panel: amplitude
distribution obtained for the same data set from the Gaussian shaping algorithm. The amplitude spectrum
includes only the signals having the trigger time between 120 μs and 125 μs (see dashed lines in the upper
panel) in order to discard accidental and mis-reconstructed events. The management of the Gelatio output
is performed via the master TTree in the Tier2 file.
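
The selection used for the amplitude spectrum combines quantities written by two different modules, which works because the per-module outputs are aligned event by event. The sketch below mimics that alignment with plain Python lists instead of ROOT friend trees; the values and field names are invented for illustration, while the 120 μs to 125 μs window is the one quoted for the trigger-time cut.

```python
# Per-module outputs, one entry per event, aligned by entry index
# (the role played by the master TTree and its friends in the Tier2 file).
trigger_tree = [
    {"trigger_us": 121.3},
    {"trigger_us": 40.2},    # accidental, far from the expected position
    {"trigger_us": 124.9},
    {"trigger_us": 180.0},   # mis-reconstructed trigger
]
amplitude_tree = [
    {"amplitude": 5120.0},
    {"amplitude": 830.0},
    {"amplitude": 2311.0},
    {"amplitude": 4020.0},
]


def select_amplitudes(triggers, amplitudes, t_min_us=120.0, t_max_us=125.0):
    """Keep the amplitudes of the events whose trigger time lies in the window."""
    if len(triggers) != len(amplitudes):
        raise ValueError("the two outputs are not aligned event by event")
    return [amp["amplitude"]
            for trig, amp in zip(triggers, amplitudes)
            if t_min_us <= trig["trigger_us"] <= t_max_us]


selected = select_amplitudes(trigger_tree, amplitude_tree)
print(selected)
```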

The amplitude was reconstructed off-line using two independent algorithms, Gaussian shaping
(shown in Figure 6) and the Gast method [?], both implemented as Gelatio modules. After
the appropriate tuning of the parameters, the two methods give equivalent results for the energy
resolution. The energy reconstruction and resolution provided by Gelatio for the Gerda data
were compared to the results obtained with an independent and dedicated analysis code and found
to be consistent.

5. Conclusions 

A powerful and flexible software framework called Gelatio has been developed to handle the full
analysis flow of the Gerda experiment and the related R&D activities. The software is written in
C++ and is based on an object-oriented design. Gelatio contains executable programs and utility
scripts which take care of the full analysis chain, starting from the raw data up to the final condensed
parameters. Complex analyses, possibly involving energy reconstruction, detector coincidence and
pulse shape discrimination, can hence be managed in a very transparent and general way.

As raw data are converted into a standardized ROOT-based format, data streams originating from
the different DAQ systems used in Gerda and from Monte Carlo simulations can be treated with
the same algorithms and along the same analysis flow, easing the cross-check and inter-comparison
among the results.

A modular analysis approach based on Tam is used, designed to sub-divide the analysis work
into many nearly-independent tasks that can be activated interactively and possibly run in parallel on
multi-thread systems. The active modules and the corresponding parameters are selected by the end
user via a human-readable INI file or a graphical interface. The GUI also acts as an event displayer
and as an interactive analysis manager, able to show each intermediate step of the modular analysis.

A stable version of Gelatio is presently available for the Gerda Collaboration and the
framework has been widely used for the data analysis in the Gerda commissioning. The Gerda
database system, which is currently under development, supports Gelatio in input and output.
It is able to import the analysis results from Tier2 files and to generate Tier1 files from a custom
selection of events made through SQL queries. Gelatio is also used in other GERDA-related
activities, including LArGe and the characterization of prototype BEGe detectors.

The software has been validated against other independent and dedicated analysis codes. Fur- 
thermore, Gelatio proved to be robust, effective and flexible enough to handle a complex analysis 
stream from a real-life multi-channel experiment. Gelatio could be used in any experimental ac- 
tivities involving digital pulse shape analysis of HPGe detector signals. 